Overview

Dataset statistics

Number of variables: 29
Number of observations: 2075427
Missing cells: 17761579
Missing cells (%): 29.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 459.2 MiB
Average record size in memory: 232.0 B
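
The derived figures in this block can be cross-checked from the raw counts. The sketch below is only an illustration of that arithmetic; it assumes that "missing cells (%)" is taken over all cells (variables times observations) and that MiB means 1024 * 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 29
n_observations = 2_075_427
missing_cells = 17_761_579
total_size_mib = 459.2

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (29.5% and 232.0 B).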

Variable types

DateTime: 2
Categorical: 6
Unsupported: 1
Numeric: 8
Text: 12

Alerts

NUMBER OF PEDESTRIANS KILLED is highly imbalanced (99.6%) [Imbalance]
NUMBER OF CYCLIST INJURED is highly imbalanced (92.3%) [Imbalance]
NUMBER OF CYCLIST KILLED is highly imbalanced (99.9%) [Imbalance]
CONTRIBUTING FACTOR VEHICLE 4 is highly imbalanced (90.8%) [Imbalance]
CONTRIBUTING FACTOR VEHICLE 5 is highly imbalanced (89.9%) [Imbalance]
BOROUGH has 645746 (31.1%) missing values [Missing]
ZIP CODE has 645996 (31.1%) missing values [Missing]
LATITUDE has 233626 (11.3%) missing values [Missing]
LONGITUDE has 233626 (11.3%) missing values [Missing]
LOCATION has 233626 (11.3%) missing values [Missing]
ON STREET NAME has 440569 (21.2%) missing values [Missing]
CROSS STREET NAME has 784436 (37.8%) missing values [Missing]
OFF STREET NAME has 1727231 (83.2%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 2 has 321736 (15.5%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 3 has 1927163 (92.9%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 4 has 2041953 (98.4%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 5 has 2066358 (99.6%) missing values [Missing]
VEHICLE TYPE CODE 2 has 396691 (19.1%) missing values [Missing]
VEHICLE TYPE CODE 3 has 1932530 (93.1%) missing values [Missing]
VEHICLE TYPE CODE 4 has 2043115 (98.4%) missing values [Missing]
VEHICLE TYPE CODE 5 has 2066635 (99.6%) missing values [Missing]
LATITUDE is highly skewed (γ1 = -20.43042564) [Skewed]
NUMBER OF PERSONS KILLED is highly skewed (γ1 = 33.71743399) [Skewed]
NUMBER OF MOTORIST KILLED is highly skewed (γ1 = 54.74414747) [Skewed]
COLLISION_ID has unique values [Unique]
ZIP CODE is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
NUMBER OF PERSONS INJURED has 1601221 (77.2%) zeros [Zeros]
NUMBER OF PERSONS KILLED has 2072415 (99.9%) zeros [Zeros]
NUMBER OF PEDESTRIANS INJURED has 1962919 (94.6%) zeros [Zeros]
NUMBER OF MOTORIST INJURED has 1772939 (85.4%) zeros [Zeros]
NUMBER OF MOTORIST KILLED has 2074246 (99.9%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-26 23:40:06.032075
Analysis finished: 2024-03-26 23:41:54.606460
Duration: 1 minute and 48.57 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json
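
A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The sketch below is a minimal illustration, not the configuration used for this run (that is stored in config.json). The helper name `build_profile`, the report title, and the file paths in the usage note are hypothetical.

```python
def build_profile(csv_path, html_path):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # low_memory=False makes pandas infer each column's dtype from the whole
    # file, which helps with mixed-type columns.
    df = pd.read_csv(csv_path, low_memory=False)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)
    return profile
```

A call such as `build_profile("collisions.csv", "report.html")` would write the report; both paths are placeholders.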

Variables

Distinct: 4283
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 MiB
Minimum: 2012-07-01 00:00:00
Maximum: 2024-03-22 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

Distinct: 1440
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 MiB
Minimum: 2024-03-26 00:00:00
Maximum: 2024-03-26 23:59:00
[Figure: histogram with fixed size bins (bins=50)]

BOROUGH
Categorical

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 645746
Missing (%): 31.1%
Memory size: 15.8 MiB
BROOKLYN: 454727
QUEENS: 383365
MANHATTAN: 320242
BRONX: 211335
STATEN ISLAND: 60012

Length

Max length: 13
Median length: 9
Mean length: 7.4541209
Min length: 5

Characters and Unicode

Total characters: 10657015
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BROOKLYN
2nd row: BROOKLYN
3rd row: BRONX
4th row: BROOKLYN
5th row: MANHATTAN

Common Values

Value | Count | Frequency (%)
BROOKLYN | 454727 | 21.9%
QUEENS | 383365 | 18.5%
MANHATTAN | 320242 | 15.4%
BRONX | 211335 | 10.2%
STATEN ISLAND | 60012 | 2.9%
(Missing) | 645746 | 31.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
brooklyn | 454727 | 30.5%
queens | 383365 | 25.7%
manhattan | 320242 | 21.5%
bronx | 211335 | 14.2%
staten | 60012 | 4.0%
island | 60012 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
N | 1809935 | 17.0%
O | 1120789 | 10.5%
A | 1080750 | 10.1%
E | 826742 | 7.8%
T | 760508 | 7.1%
R | 666062 | 6.2%
B | 666062 | 6.2%
L | 514739 | 4.8%
S | 503389 | 4.7%
Y | 454727 | 4.3%
Other values (9) | 2253312 | 21.1%

ZIP CODE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 645996
Missing (%): 31.1%
Memory size: 15.8 MiB
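
The report does not say why ZIP CODE was classed as unsupported. One common cause, assumed here, is a column that mixes integers, floats, strings, and blanks. The sketch below shows one way such values could be normalised to five-digit strings before re-profiling; `normalize_zip` is a hypothetical helper, not part of ydata-profiling.

```python
import re

def normalize_zip(value):
    """Return a 5-digit ZIP string, or None when the value is unusable."""
    if value is None:
        return None
    text = str(value).strip()
    # A float such as 11208.0 can appear when the column was read as numeric.
    if re.fullmatch(r"\d+\.0+", text):
        text = text.split(".")[0]
    if re.fullmatch(r"\d{5}", text):
        return text
    return None

samples = [11208, "11208", 11208.0, " 10001 ", "", None, "N/A"]
print([normalize_zip(v) for v in samples])
```

Keeping ZIP codes as strings rather than numbers also preserves any leading zeros.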

LATITUDE
Real number (ℝ)

MISSING  SKEWED 

Distinct: 126594
Distinct (%): 6.9%
Missing: 233626
Missing (%): 11.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.627693
Minimum: 0
Maximum: 43.344444
Zeros: 4360
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 40.596622
Q1: 40.6678
median: 40.72083
Q3: 40.769592
95-th percentile: 40.86205
Maximum: 43.344444
Range: 43.344444
Interquartile range (IQR): 0.101792

Descriptive statistics

Standard deviation: 1.9806568
Coefficient of variation (CV): 0.048751397
Kurtosis: 416.08064
Mean: 40.627693
Median Absolute Deviation (MAD): 0.051354
Skewness: -20.430426
Sum: 74828126
Variance: 3.9230014
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
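
Several of the LATITUDE statistics are simple functions of the others. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digits are expected. It uses the standard definitions of these quantities.

```python
import math

# Values copied from the LATITUDE statistics in the report.
mean = 40.627693
std = 1.9806568
q1, q3 = 40.6678, 40.769592
minimum, maximum = 0.0, 43.344444

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.9f} variance={variance:.7f} IQR={iqr:.6f} range={value_range}")
```

The IQR of about 0.1 degrees, set against a range of more than 43 degrees, shows how far the zero and out-of-area values sit from the bulk of the data.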
Value | Count | Frequency (%)
0 | 4360 | 0.2%
40.861862 | 883 | < 0.1%
40.696033 | 762 | < 0.1%
40.8047 | 692 | < 0.1%
40.608757 | 671 | < 0.1%
40.798256 | 627 | < 0.1%
40.759308 | 622 | < 0.1%
40.6960346 | 587 | < 0.1%
40.675735 | 557 | < 0.1%
40.658577 | 520 | < 0.1%
Other values (126584) | 1831520 | 88.2%
(Missing) | 233626 | 11.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4360 | 0.2%
30.78418 | 1 | < 0.1%
34.783634 | 1 | < 0.1%
40.4989488 | 2 | < 0.1%
40.4991346 | 1 | < 0.1%
40.49931 | 1 | < 0.1%
40.4994787 | 1 | < 0.1%
40.499659 | 1 | < 0.1%
40.49971 | 1 | < 0.1%
40.49984 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
43.344444 | 1 | < 0.1%
42.64154 | 1 | < 0.1%
42.318317 | 1 | < 0.1%
42.107204 | 1 | < 0.1%
41.91661 | 1 | < 0.1%
41.34796 | 1 | < 0.1%
41.258785 | 1 | < 0.1%
41.12615 | 5 | < 0.1%
41.12421 | 1 | < 0.1%
41.061634 | 2 | < 0.1%

LONGITUDE
Real number (ℝ)

MISSING 

Distinct: 98351
Distinct (%): 5.3%
Missing: 233626
Missing (%): 11.3%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.752129
Minimum: -201.35999
Maximum: 0
Zeros: 4360
Zeros (%): 0.2%
Negative: 1837441
Negative (%): 88.5%
Memory size: 15.8 MiB

Quantile statistics

Minimum: -201.35999
5-th percentile: -74.03607
Q1: -73.97484
median: -73.92726
Q3: -73.866731
95-th percentile: -73.763239
Maximum: 0
Range: 201.35999
Interquartile range (IQR): 0.1081089

Descriptive statistics

Standard deviation: 3.7233454
Coefficient of variation (CV): -0.050484581
Kurtosis: 440.66
Mean: -73.752129
Median Absolute Deviation (MAD): 0.0526217
Skewness: 16.099628
Sum: -1.3583675 × 10^8
Variance: 13.863301
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 4360 | 0.2%
-73.89063 | 763 | < 0.1%
-73.91282 | 719 | < 0.1%
-73.98453 | 699 | < 0.1%
-74.038086 | 672 | < 0.1%
-73.89686 | 657 | < 0.1%
-73.91243 | 654 | < 0.1%
-73.9845292 | 587 | < 0.1%
-73.94476 | 583 | < 0.1%
-73.9112 | 576 | < 0.1%
Other values (98341) | 1831531 | 88.2%
(Missing) | 233626 | 11.3%

Minimum 10 values

Value | Count | Frequency (%)
-201.35999 | 1 | < 0.1%
-201.23706 | 105 | < 0.1%
-89.13527 | 1 | < 0.1%
-86.76847 | 1 | < 0.1%
-79.61955 | 1 | < 0.1%
-79.00183 | 1 | < 0.1%
-76.2634 | 1 | < 0.1%
-76.02163 | 1 | < 0.1%
-74.742 | 7 | < 0.1%
-74.25496 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0 | 4360 | 0.2%
-32.768513 | 16 | < 0.1%
-47.209625 | 3 | < 0.1%
-73.66301 | 1 | < 0.1%
-73.70055 | 2 | < 0.1%
-73.700584 | 11 | < 0.1%
-73.7005968 | 10 | < 0.1%
-73.70061 | 4 | < 0.1%
-73.70071 | 4 | < 0.1%
-73.70073 | 1 | < 0.1%
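
The extreme values above (4360 zeros, longitudes near -201, latitudes above 42) fall well outside the 5th to 95th percentile band that holds most records. The sketch below shows one way to flag such coordinates. The bounding box is an assumption: a rough rectangle around New York City chosen to contain the reported percentile band, not a value taken from the report.

```python
# Assumption: a rough bounding box around New York City; tune as needed.
LAT_MIN, LAT_MAX = 40.4, 41.0
LON_MIN, LON_MAX = -74.3, -73.6

def is_plausible(lat, lon):
    """Return True when both coordinates fall inside the assumed bounding box."""
    if lat is None or lon is None:
        return False
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX

# One sampled location from the report, then three suspicious combinations.
points = [(40.667202, -73.8665), (0.0, 0.0), (40.7, -201.23706), (43.344444, -73.9)]
print([is_plausible(lat, lon) for lat, lon in points])
```

Flagged rows may be better set to missing than dropped, since their other fields can still be valid.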

LOCATION
Text

MISSING 

Distinct: 283006
Distinct (%): 15.4%
Missing: 233626
Missing (%): 11.3%
Memory size: 15.8 MiB

Length

Max length: 25
Median length: 24
Mean length: 22.779989
Min length: 10

Characters and Unicode

Total characters: 41956206
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 155498
Unique (%): 8.4%

Sample

1st row: (40.667202, -73.8665)
2nd row: (40.683304, -73.917274)
3rd row: (40.709183, -73.956825)
4th row: (40.86816, -73.83148)
5th row: (40.67172, -73.8971)
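
LOCATION is profiled as text because each value is a "(latitude, longitude)" string. The sketch below shows one way to split such a string into two floats, assuming the format seen in the sample rows; `parse_location` is a hypothetical helper.

```python
import re

# Matches "(lat, lon)" with optional sign and decimals on each number.
_LOCATION_RE = re.compile(
    r"^\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)$"
)

def parse_location(text):
    """Split '(lat, lon)' into two floats; return None if it does not match."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

print(parse_location("(40.667202, -73.8665)"))
```

LOCATION has the same missing count as LATITUDE and LONGITUDE (233626), so it is probably redundant with those two columns.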
Value | Count | Frequency (%)
0.0 | 8720 | 0.2%
40.861862 | 883 | < 0.1%
73.89063 | 763 | < 0.1%
40.696033 | 762 | < 0.1%
73.91282 | 719 | < 0.1%
73.98453 | 699 | < 0.1%
40.8047 | 692 | < 0.1%
74.038086 | 672 | < 0.1%
40.608757 | 671 | < 0.1%
73.89686 | 657 | < 0.1%
Other values (224934) | 3668364 | 99.6%

Most occurring characters

Value | Count | Frequency (%)
7 | 4595577 | 11.0%
4 | 3980471 | 9.5%
. | 3683602 | 8.8%
3 | 3498540 | 8.3%
0 | 3400841 | 8.1%
9 | 2700094 | 6.4%
8 | 2648683 | 6.3%
6 | 2616640 | 6.2%
5 | 2094509 | 5.0%
( | 1841801 | 4.4%
Other values (6) | 10895448 | 26.0%

ON STREET NAME
Text

MISSING 

Distinct: 18410
Distinct (%): 1.1%
Missing: 440569
Missing (%): 21.2%
Memory size: 15.8 MiB

Length

Max length: 32
Median length: 32
Mean length: 29.630325
Min length: 2

Characters and Unicode

Total characters: 48441374
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6537
Unique (%): 0.4%

Sample

1st row: WHITESTONE EXPRESSWAY
2nd row: QUEENSBORO BRIDGE UPPER
3rd row: THROGS NECK BRIDGE
4th row: SARATOGA AVENUE
5th row: MAJOR DEEGAN EXPRESSWAY RAMP
Value | Count | Frequency (%)
avenue | 608264 | 16.1%
street | 520901 | 13.8%
east | 153481 | 4.1%
boulevard | 127014 | 3.4%
west | 114792 | 3.0%
parkway | 74643 | 2.0%
road | 68123 | 1.8%
expressway | 63293 | 1.7%
island | 30410 | 0.8%
queens | 27154 | 0.7%
Other values (5393) | 1983965 | 52.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 27562630 | 56.9%
E | 3672854 | 7.6%
A | 1951050 | 4.0%
T | 1831929 | 3.8%
R | 1669600 | 3.4%
N | 1427915 | 2.9%
S | 1407885 | 2.9%
U | 977757 | 2.0%
O | 868930 | 1.8%
V | 852133 | 1.8%
Other values (65) | 6218691 | 12.8%

CROSS STREET NAME
Text

MISSING 

Distinct: 20236
Distinct (%): 1.6%
Missing: 784436
Missing (%): 37.8%
Memory size: 15.8 MiB

Length

Max length: 32
Median length: 32
Mean length: 22.706216
Min length: 1

Characters and Unicode

Total characters: 29313520
Distinct characters: 76
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6201
Unique (%): 0.5%

Sample

1st row: 20 AVENUE
2nd row: DECATUR STREET
3rd row: EAST 43 STREET
4th row: EAST GATE PLAZA
5th row: west 80 street -west 81 street
Value | Count | Frequency (%)
avenue | 565307 | 19.8%
street | 459527 | 16.1%
east | 112172 | 3.9%
west | 71155 | 2.5%
boulevard | 68647 | 2.4%
road | 55544 | 1.9%
place | 33946 | 1.2%
parkway | 26605 | 0.9%
3 | 18757 | 0.7%
park | 17426 | 0.6%
Other values (5483) | 1426325 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 14115616 | 48.2%
E | 2937153 | 10.0%
T | 1453458 | 5.0%
A | 1419427 | 4.8%
R | 1147248 | 3.9%
N | 1074756 | 3.7%
S | 988831 | 3.4%
U | 777244 | 2.7%
V | 708819 | 2.4%
O | 578382 | 2.0%
Other values (66) | 4112586 | 14.0%

OFF STREET NAME
Text

MISSING 

Distinct: 225845
Distinct (%): 64.9%
Missing: 1727231
Missing (%): 83.2%
Memory size: 15.8 MiB

Length

Max length: 40
Median length: 40
Mean length: 36.021158
Min length: 8

Characters and Unicode

Total characters: 12542423
Distinct characters: 84
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 176197
Unique (%): 50.6%

Sample

1st row: 1211 LORING AVENUE
2nd row: 344 BAYCHESTER AVENUE
3rd row: 2047 PITKIN AVENUE
4th row: 480 DEAN STREET
5th row: 878 FLATBUSH AVENUE
Value | Count | Frequency (%)
avenue | 137975 | 11.9%
street | 125856 | 10.9%
east | 33204 | 2.9%
west | 23966 | 2.1%
boulevard | 22127 | 1.9%
road | 16430 | 1.4%
lot | 7881 | 0.7%
parking | 7267 | 0.6%
of | 6949 | 0.6%
parkway | 6943 | 0.6%
Other values (27589) | 769819 | 66.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6866584 | 54.7%
E | 796771 | 6.4%
T | 436287 | 3.5%
A | 408734 | 3.3%
R | 339643 | 2.7%
N | 298626 | 2.4%
S | 285926 | 2.3%
1 | 276924 | 2.2%
U | 203017 | 1.6%
V | 189426 | 1.5%
Other values (74) | 2440485 | 19.5%

NUMBER OF PERSONS INJURED
Real number (ℝ)

ZEROS 

Distinct: 32
Distinct (%): < 0.1%
Missing: 18
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.30980159
Minimum: 0
Maximum: 43
Zeros: 1601221
Zeros (%): 77.2%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.69996885
Coefficient of variation (CV): 2.2594102
Kurtosis: 51.296075
Mean: 0.30980159
Median Absolute Deviation (MAD): 0
Skewness: 4.2602307
Sum: 642965
Variance: 0.4899564
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=32)]
Value | Count | Frequency (%)
0 | 1601221 | 77.2%
1 | 368039 | 17.7%
2 | 69310 | 3.3%
3 | 22649 | 1.1%
4 | 8403 | 0.4%
5 | 3225 | 0.2%
6 | 1350 | 0.1%
7 | 574 | < 0.1%
8 | 252 | < 0.1%
9 | 129 | < 0.1%
Other values (22) | 257 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1601221 | 77.2%
1 | 368039 | 17.7%
2 | 69310 | 3.3%
3 | 22649 | 1.1%
4 | 8403 | 0.4%
5 | 3225 | 0.2%
6 | 1350 | 0.1%
7 | 574 | < 0.1%
8 | 252 | < 0.1%
9 | 129 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
43 | 1 | < 0.1%
40 | 1 | < 0.1%
34 | 1 | < 0.1%
32 | 1 | < 0.1%
31 | 1 | < 0.1%
27 | 1 | < 0.1%
25 | 1 | < 0.1%
24 | 3 | < 0.1%
23 | 1 | < 0.1%
22 | 3 | < 0.1%

NUMBER OF PERSONS KILLED
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 7
Distinct (%): < 0.1%
Missing: 31
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0014951363
Minimum: 0
Maximum: 8
Zeros: 2072415
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.040773863
Coefficient of variation (CV): 27.271
Kurtosis: 1937.399
Mean: 0.0014951363
Median Absolute Deviation (MAD): 0
Skewness: 33.717434
Sum: 3103
Variance: 0.0016625079
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]
Value | Count | Frequency (%)
0 | 2072415 | 99.9%
1 | 2889 | 0.1%
2 | 74 | < 0.1%
3 | 12 | < 0.1%
4 | 3 | < 0.1%
5 | 2 | < 0.1%
8 | 1 | < 0.1%
(Missing) | 31 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2072415 | 99.9%
1 | 2889 | 0.1%
2 | 74 | < 0.1%
3 | 12 | < 0.1%
4 | 3 | < 0.1%
5 | 2 | < 0.1%
8 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
8 | 1 | < 0.1%
5 | 2 | < 0.1%
4 | 3 | < 0.1%
3 | 12 | < 0.1%
2 | 74 | < 0.1%
1 | 2889 | 0.1%
0 | 2072415 | 99.9%

NUMBER OF PEDESTRIANS INJURED
Real number (ℝ)

ZEROS 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.056549327
Minimum: 0
Maximum: 27
Zeros: 1962919
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 27
Range: 27
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2440835
Coefficient of variation (CV): 4.3162936
Kurtosis: 129.0936
Mean: 0.056549327
Median Absolute Deviation (MAD): 0
Skewness: 5.6862516
Sum: 117364
Variance: 0.059576754
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=14)]
Value | Count | Frequency (%)
0 | 1962919 | 94.6%
1 | 108371 | 5.2%
2 | 3663 | 0.2%
3 | 365 | < 0.1%
4 | 60 | < 0.1%
5 | 26 | < 0.1%
6 | 11 | < 0.1%
7 | 4 | < 0.1%
9 | 2 | < 0.1%
8 | 2 | < 0.1%
Other values (4) | 4 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1962919 | 94.6%
1 | 108371 | 5.2%
2 | 3663 | 0.2%
3 | 365 | < 0.1%
4 | 60 | < 0.1%
5 | 26 | < 0.1%
6 | 11 | < 0.1%
7 | 4 | < 0.1%
8 | 2 | < 0.1%
9 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
27 | 1 | < 0.1%
19 | 1 | < 0.1%
15 | 1 | < 0.1%
13 | 1 | < 0.1%
9 | 2 | < 0.1%
8 | 2 | < 0.1%
7 | 4 | < 0.1%
6 | 11 | < 0.1%
5 | 26 | < 0.1%
4 | 60 | < 0.1%

NUMBER OF PEDESTRIANS KILLED
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 MiB
0: 2073905
1: 1509
2: 12
6: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2075427
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2073905 | 99.9%
1 | 1509 | 0.1%
2 | 12 | < 0.1%
6 | 1 | < 0.1%
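
The 99.6% in the imbalance alert for this variable is a different number from the 99.9% share of zeros. It can be reproduced from the counts above under one assumption: that the score is one minus the Shannon entropy of the class distribution divided by the maximum possible entropy for that number of classes. The sketch below is an illustration under that assumption, not a copy of the library's code.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts for NUMBER OF PEDESTRIANS KILLED (values 0, 1, 2, 6).
counts = [2073905, 1509, 12, 1]
print(f"{100 * imbalance_score(counts):.1f}%")
```

A score of 0 means perfectly balanced classes, and a score near 1 means almost every row falls into a single class.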

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
0 | 2073905 | 99.9%
1 | 1509 | 0.1%
2 | 12 | < 0.1%
6 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 2073905 | 99.9%
1 | 1509 | 0.1%
2 | 12 | < 0.1%
6 | 1 | < 0.1%

NUMBER OF CYCLIST INJURED
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 MiB
0: 2020463
1: 54340
2: 600
3: 23
4: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2075427
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2020463 | 97.4%
1 | 54340 | 2.6%
2 | 600 | < 0.1%
3 | 23 | < 0.1%
4 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
0 | 2020463 | 97.4%
1 | 54340 | 2.6%
2 | 600 | < 0.1%
3 | 23 | < 0.1%
4 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 2020463 | 97.4%
1 | 54340 | 2.6%
2 | 600 | < 0.1%
3 | 23 | < 0.1%
4 | 1 | < 0.1%

NUMBER OF CYCLIST KILLED
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 MiB
0: 2075189
1: 237
2: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2075427
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2075189 | > 99.9%
1 | 237 | < 0.1%
2 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
0 | 2075189 | > 99.9%
1 | 237 | < 0.1%
2 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 2075189 | > 99.9%
1 | 237 | < 0.1%
2 | 1 | < 0.1%

NUMBER OF MOTORIST INJURED
Real number (ℝ)

ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22282162
Minimum: 0
Maximum: 43
Zeros: 1772939
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.66109218
Coefficient of variation (CV): 2.9669122
Kurtosis: 63.717057
Mean: 0.22282162
Median Absolute Deviation (MAD): 0
Skewness: 5.1266596
Sum: 462450
Variance: 0.43704287
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=31)]
Value | Count | Frequency (%)
0 | 1772939 | 85.4%
1 | 203426 | 9.8%
2 | 63230 | 3.0%
3 | 21961 | 1.1%
4 | 8230 | 0.4%
5 | 3175 | 0.2%
6 | 1304 | 0.1%
7 | 548 | < 0.1%
8 | 245 | < 0.1%
9 | 123 | < 0.1%
Other values (21) | 246 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1772939 | 85.4%
1 | 203426 | 9.8%
2 | 63230 | 3.0%
3 | 21961 | 1.1%
4 | 8230 | 0.4%
5 | 3175 | 0.2%
6 | 1304 | 0.1%
7 | 548 | < 0.1%
8 | 245 | < 0.1%
9 | 123 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
43 | 1 | < 0.1%
40 | 1 | < 0.1%
34 | 1 | < 0.1%
31 | 1 | < 0.1%
30 | 1 | < 0.1%
25 | 1 | < 0.1%
24 | 3 | < 0.1%
23 | 1 | < 0.1%
22 | 2 | < 0.1%
21 | 1 | < 0.1%

NUMBER OF MOTORIST KILLED
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.00061529507
Minimum: 0
Maximum: 5
Zeros: 2074246
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.027135542
Coefficient of variation (CV): 44.101673
Kurtosis: 4230.0939
Mean: 0.00061529507
Median Absolute Deviation (MAD): 0
Skewness: 54.744147
Sum: 1277
Variance: 0.00073633763
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]
Value | Count | Frequency (%)
0 | 2074246 | 99.9%
1 | 1107 | 0.1%
2 | 58 | < 0.1%
3 | 12 | < 0.1%
4 | 2 | < 0.1%
5 | 2 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2074246 | 99.9%
1 | 1107 | 0.1%
2 | 58 | < 0.1%
3 | 12 | < 0.1%
4 | 2 | < 0.1%
5 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5 | 2 | < 0.1%
4 | 2 | < 0.1%
3 | 12 | < 0.1%
2 | 58 | < 0.1%
1 | 1107 | 0.1%
0 | 2074246 | 99.9%

CONTRIBUTING FACTOR VEHICLE 1
Text

Distinct: 61
Distinct (%): < 0.1%
Missing: 6802
Missing (%): 0.3%
Memory size: 15.8 MiB

Length

Max length: 53
Median length: 43
Mean length: 19.504495
Min length: 1

Characters and Unicode

Total characters: 40347485
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aggressive Driving/Road Rage
2nd row: Pavement Slippery
3rd row: Following Too Closely
4th row: Unspecified
5th row: Unspecified
Value | Count | Frequency (%)
unspecified | 706732 | 17.1%
driver | 447768 | 10.9%
inattention/distraction | 415252 | 10.1%
too | 162593 | 3.9%
closely | 162593 | 3.9%
to | 148089 | 3.6%
failure | 129495 | 3.1%
yield | 123304 | 3.0%
right-of-way | 123304 | 3.0%
following | 110930 | 2.7%
Other values (96) | 1591210 | 38.6%

Most occurring characters

Value | Count | Frequency (%)
i | 4541258 | 11.3%
e | 4110099 | 10.2%
n | 3507152 | 8.7%
t | 2798284 | 6.9%
o | 2379399 | 5.9%
r | 2368411 | 5.9%
s | 2097469 | 5.2%
(space) | 2052645 | 5.1%
a | 1989702 | 4.9%
c | 1555120 | 3.9%
Other values (45) | 12947946 | 32.1%


CONTRIBUTING FACTOR VEHICLE 2
Text

MISSING

Distinct: 61
Distinct (%): < 0.1%
Missing: 321736
Missing (%): 15.5%
Memory size: 15.8 MiB

Length

Max length: 53
Median length: 11
Mean length: 13.048611
Min length: 1
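
The length statistics can be cross-checked against the other figures for this variable: 22883231 total characters over 2075427 − 321736 = 1753691 non-missing values gives a mean of about 13.0486, which matches the reported mean length. The sketch below computes the same four statistics on a toy list; `length_summary` is a hypothetical helper, and skipping missing values (represented here as `None`) is an assumption consistent with that check.

```python
import statistics


def length_summary(values):
    """Max, median, mean, and min string length, ignoring missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.fmean(lengths),
        "min": min(lengths),
    }


sample = [
    "Unspecified",
    None,  # stands in for a missing value
    "Following Too Closely",
    "Driver Inattention/Distraction",
    "Unspecified",
]
summary = length_summary(sample)
print(summary)
```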

Characters and Unicode

Total characters: 22883231
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Value | Count | Frequency (%)
unspecified | 1476469 | 68.6%
driver | 100961 | 4.7%
inattention/distraction | 94252 | 4.4%
other | 33129 | 1.5%
vehicular | 32066 | 1.5%
too | 27733 | 1.3%
closely | 27733 | 1.3%
passing | 21554 | 1.0%
to | 21532 | 1.0%
lane | 20107 | 0.9%
Other values (96) | 295716 | 13.7%

Most occurring characters

Value | Count | Frequency (%)
i | 3607436 | 15.8%
e | 3511207 | 15.3%
n | 2050908 | 9.0%
s | 1757954 | 7.7%
c | 1666318 | 7.3%
d | 1549984 | 6.8%
p | 1546225 | 6.8%
f | 1532577 | 6.7%
U | 1512982 | 6.6%
t | 619191 | 2.7%
Other values (45) | 3528449 | 15.4%

Most occurring categories, scripts, and blocks

All 22883231 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

CONTRIBUTING FACTOR VEHICLE 3
Text

MISSING

Distinct: 51
Distinct (%): < 0.1%
Missing: 1927163
Missing (%): 92.9%
Memory size: 15.8 MiB

Length

Max length: 53
Median length: 11
Mean length: 11.656053
Min length: 1

Characters and Unicode

Total characters: 1728173
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%
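
The report lists both "Distinct" (51 here) and "Unique" (5 here) without defining them in this extract. A reading consistent with the numbers is that "Distinct" counts the different values present and "Unique" counts the values that occur exactly once; the sketch below is written under that assumption, with a hypothetical helper name and toy data.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


sample = [
    "Unspecified",
    "Unspecified",
    "Other Vehicular",
    None,  # stands in for a missing value
    "Pavement Slippery",
    "Other Vehicular",
]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```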

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Value | Count | Frequency (%)
unspecified | 138219 | 85.8%
other | 2813 | 1.7%
vehicular | 2773 | 1.7%
driver | 2131 | 1.3%
too | 2011 | 1.2%
closely | 2011 | 1.2%
following | 1957 | 1.2%
inattention/distraction | 1950 | 1.2%
fatigued/drowsy | 853 | 0.5%
pavement | 410 | 0.3%
Other values (79) | 5908 | 3.7%

Most occurring characters

Value | Count | Frequency (%)
e | 295337 | 17.1%
i | 294017 | 17.0%
n | 151554 | 8.8%
s | 145163 | 8.4%
c | 144599 | 8.4%
d | 140321 | 8.1%
p | 139879 | 8.1%
f | 139124 | 8.1%
U | 138882 | 8.0%
o | 17264 | 1.0%
Other values (45) | 122033 | 7.1%

Most occurring categories, scripts, and blocks

All 1728173 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

CONTRIBUTING FACTOR VEHICLE 4
Categorical

IMBALANCE  MISSING 

Distinct: 41
Distinct (%): 0.1%
Missing: 2041953
Missing (%): 98.4%
Memory size: 15.8 MiB

Value | Count
Unspecified | 31577
Other Vehicular | 614
Following Too Closely | 390
Driver Inattention/Distraction | 275
Fatigued/Drowsy | 170
Other values (36) | 448

Length

Max length: 43
Median length: 11
Mean length: 11.489425
Min length: 5

Characters and Unicode

Total characters: 384597
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 31577 | 1.5%
Other Vehicular | 614 | < 0.1%
Following Too Closely | 390 | < 0.1%
Driver Inattention/Distraction | 275 | < 0.1%
Fatigued/Drowsy | 170 | < 0.1%
Pavement Slippery | 116 | < 0.1%
Reaction to Uninvolved Vehicle | 41 | < 0.1%
Unsafe Speed | 32 | < 0.1%
Outside Car Distraction | 28 | < 0.1%
Driver Inexperience | 27 | < 0.1%
Other values (31) | 204 | < 0.1%
(Missing) | 2041953 | 98.4%
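
The IMBALANCE flag on this variable reflects how strongly one value ("Unspecified") dominates the non-missing rows. One common way to express that as a single number is one minus the Shannon entropy of the value counts divided by the maximum possible entropy for that many classes. Whether the report uses exactly this definition is an assumption, and `imbalance_score` is a hypothetical name. The counts below are the ten named rows of the table above; the remaining 31 values are lumped together in the report, so its own percentage cannot be reproduced from this table alone.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform distribution, 1 when one class holds everything."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts of the ten named values in the "Common Values" table.
top_counts = [31577, 614, 390, 275, 170, 116, 41, 32, 28, 27]
score = imbalance_score(top_counts)
print(f"imbalance over the ten listed values: {score:.3f}")
```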

Length

[Figure] Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 31577 | 88.1%
other | 623 | 1.7%
vehicular | 614 | 1.7%
too | 395 | 1.1%
closely | 395 | 1.1%
following | 390 | 1.1%
driver | 302 | 0.8%
inattention/distraction | 275 | 0.8%
fatigued/drowsy | 170 | 0.5%
pavement | 119 | 0.3%
Other values (64) | 965 | 2.7%

Most occurring characters

Value | Count | Frequency (%)
e | 66721 | 17.3%
i | 66107 | 17.2%
n | 33651 | 8.7%
c | 32739 | 8.5%
s | 32723 | 8.5%
p | 31939 | 8.3%
d | 31931 | 8.3%
f | 31705 | 8.2%
U | 31684 | 8.2%
o | 3077 | 0.8%
Other values (41) | 22320 | 5.8%

Most occurring categories, scripts, and blocks

All 384597 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

CONTRIBUTING FACTOR VEHICLE 5
Categorical

IMBALANCE  MISSING 

Distinct: 30
Distinct (%): 0.3%
Missing: 2066358
Missing (%): 99.6%
Memory size: 15.8 MiB

Value | Count
Unspecified | 8549
Other Vehicular | 178
Following Too Closely | 98
Driver Inattention/Distraction | 64
Pavement Slippery | 49
Other values (25) | 131

Length

Max length: 43
Median length: 11
Mean length: 11.468078
Min length: 5

Characters and Unicode

Total characters: 104004
Distinct characters: 50
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 8549 | 0.4%
Other Vehicular | 178 | < 0.1%
Following Too Closely | 98 | < 0.1%
Driver Inattention/Distraction | 64 | < 0.1%
Pavement Slippery | 49 | < 0.1%
Fatigued/Drowsy | 41 | < 0.1%
Reaction to Uninvolved Vehicle | 12 | < 0.1%
Alcohol Involvement | 11 | < 0.1%
Obstruction/Debris | 10 | < 0.1%
Driver Inexperience | 10 | < 0.1%
Other values (20) | 47 | < 0.1%
(Missing) | 2066358 | 99.6%

Length

[Figure] Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 8549 | 88.2%
other | 180 | 1.9%
vehicular | 178 | 1.8%
too | 100 | 1.0%
closely | 100 | 1.0%
following | 98 | 1.0%
driver | 74 | 0.8%
inattention/distraction | 64 | 0.7%
pavement | 50 | 0.5%
slippery | 49 | 0.5%
Other values (47) | 251 | 2.6%

Most occurring characters

Value | Count | Frequency (%)
e | 18109 | 17.4%
i | 17868 | 17.2%
n | 9076 | 8.7%
c | 8869 | 8.5%
s | 8820 | 8.5%
p | 8675 | 8.3%
d | 8634 | 8.3%
f | 8576 | 8.2%
U | 8572 | 8.2%
o | 781 | 0.8%
Other values (40) | 6024 | 5.8%

Most occurring categories, scripts, and blocks

All 104004 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

COLLISION_ID
Real number (ℝ)

UNIQUE 

Distinct: 2075427
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3159627
Minimum: 22
Maximum: 4712252
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 MiB

Quantile statistics

Minimum: 22
5-th percentile: 104625.3
Q1: 3154976.5
median: 3673954
Q3: 4193057.5
95-th percentile: 4608219.7
Maximum: 4712252
Range: 4712230
Interquartile range (IQR): 1038081
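
Two of the figures above are derived from the others: the range is Maximum − Minimum (4712252 − 22 = 4712230) and the IQR is Q3 − Q1 (4193057.5 − 3154976.5 = 1038081). The sketch below computes the same summary on a toy list using `statistics.quantiles` with linear interpolation (`method="inclusive"`). The choice of interpolation method is an assumption, since the report does not state it, and `quantile_summary` is a hypothetical name.

```python
import statistics


def quantile_summary(values):
    """Min, quartiles, max, range, and IQR using linear interpolation."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    low, high = min(values), max(values)
    return {
        "min": low,
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": high,
        "range": high - low,
        "iqr": q3 - q1,
    }


# The ten smallest COLLISION_ID values listed in the report.
ids = [22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
quartiles = quantile_summary(ids)
print(quartiles)
```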

Descriptive statistics

Standard deviation: 1505149.9
Coefficient of variation (CV): 0.47636949
Kurtosis: -0.032800807
Mean: 3159627
Median Absolute Deviation (MAD): 519041
Skewness: -1.2236319
Sum: 6.5575751 × 10^12
Variance: 2.2654762 × 10^12
Monotonicity: Not monotonic
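
Several of these statistics follow from one another: the coefficient of variation is the standard deviation divided by the mean (1505149.9 / 3159627 ≈ 0.4764), the variance is the standard deviation squared, and the sum is the mean times the 2075427 observations. The sketch below computes CV, MAD, and a monotonicity check on the first five COLLISION_ID values from the report's sample rows. The helper name is hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
import statistics


def descriptive_summary(values):
    """Mean, sample standard deviation, CV, MAD, and a monotonicity flag."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1); an assumption
    median = statistics.median(values)
    # Median absolute deviation: median of distances from the median.
    mad = statistics.median(abs(v - median) for v in values)
    pairs = list(zip(values, values[1:]))
    increasing = all(a <= b for a, b in pairs)
    decreasing = all(a >= b for a, b in pairs)
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,
        "mad": mad,
        "monotonic": increasing or decreasing,
    }


# COLLISION_ID of the first five sample rows, in file order.
ids = [4455765, 4513547, 4541903, 4456314, 4486609]
stats = descriptive_summary(ids)
print(stats)
```

On these five rows the identifiers rise and then fall, which is enough to make the column non-monotonic in file order.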
[Figure] Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
4455765 | 1 | < 0.1%
3176288 | 1 | < 0.1%
3188747 | 1 | < 0.1%
3176436 | 1 | < 0.1%
3189909 | 1 | < 0.1%
3187402 | 1 | < 0.1%
3178392 | 1 | < 0.1%
3183441 | 1 | < 0.1%
3178566 | 1 | < 0.1%
3185340 | 1 | < 0.1%
Other values (2075417) | 2075417 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
23 | 1 | < 0.1%
24 | 1 | < 0.1%
25 | 1 | < 0.1%
26 | 1 | < 0.1%
27 | 1 | < 0.1%
28 | 1 | < 0.1%
29 | 1 | < 0.1%
30 | 1 | < 0.1%
31 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4712252 | 1 | < 0.1%
4712247 | 1 | < 0.1%
4712246 | 1 | < 0.1%
4712245 | 1 | < 0.1%
4712242 | 1 | < 0.1%
4712241 | 1 | < 0.1%
4712237 | 1 | < 0.1%
4712235 | 1 | < 0.1%
4712232 | 1 | < 0.1%
4712231 | 1 | < 0.1%

VEHICLE TYPE CODE 1
Text

Distinct: 1631
Distinct (%): 0.1%
Missing: 13691
Missing (%): 0.7%
Memory size: 15.8 MiB

Length

Max length: 38
Median length: 35
Mean length: 16.886453
Min length: 1

Characters and Unicode

Total characters: 34815408
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 989
Unique (%): < 0.1%

Sample

1st row: Sedan
2nd row: Sedan
3rd row: Sedan
4th row: Sedan
5th row: Dump

Value | Count | Frequency (%)
vehicle | 880306 | 18.0%
utility | 633851 | 13.0%
station | 633808 | 13.0%
sedan | 619493 | 12.7%
wagon/sport | 453517 | 9.3%
passenger | 416219 | 8.5%
(blank) | 181665 | 3.7%
wagon | 180354 | 3.7%
sport | 180291 | 3.7%
truck | 85920 | 1.8%
Other values (950) | 616060 | 12.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2832968 | 8.1%
S | 2735641 | 7.9%
t | 2300987 | 6.6%
i | 1938153 | 5.6%
E | 1818931 | 5.2%
a | 1620452 | 4.7%
e | 1611200 | 4.6%
n | 1548461 | 4.4%
o | 1436044 | 4.1%
T | 1141718 | 3.3%
Other values (65) | 15830853 | 45.5%

Most occurring categories, scripts, and blocks

All 34815408 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

VEHICLE TYPE CODE 2
Text

MISSING 

Distinct: 1819
Distinct (%): 0.1%
Missing: 396691
Missing (%): 19.1%
Memory size: 15.8 MiB

Length

Max length: 38
Median length: 30
Mean length: 16.08444
Min length: 1

Characters and Unicode

Total characters: 27001529
Distinct characters: 73
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1080
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Pick-up Truck
3rd row: Sedan
4th row: Tractor Truck Diesel
5th row: Sedan

Value | Count | Frequency (%)
vehicle | 653746 | 17.1%
utility | 466778 | 12.2%
station | 466750 | 12.2%
sedan | 435556 | 11.4%
wagon/sport | 326546 | 8.5%
passenger | 318612 | 8.3%
(blank) | 141501 | 3.7%
wagon | 140256 | 3.7%
sport | 140204 | 3.7%
truck | 85272 | 2.2%
Other values (1009) | 655810 | 17.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2165263 | 8.0%
S | 2031182 | 7.5%
t | 1665937 | 6.2%
E | 1438671 | 5.3%
i | 1431599 | 5.3%
e | 1189958 | 4.4%
a | 1165845 | 4.3%
n | 1107454 | 4.1%
o | 1060004 | 3.9%
T | 919371 | 3.4%
Other values (63) | 12826245 | 47.5%

Most occurring categories, scripts, and blocks

All 27001529 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

VEHICLE TYPE CODE 3
Text

MISSING 

Distinct: 260
Distinct (%): 0.2%
Missing: 1932530
Missing (%): 93.1%
Memory size: 15.8 MiB

Length

Max length: 35
Median length: 30
Mean length: 17.679552
Min length: 2

Characters and Unicode

Total characters: 2526355
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 152
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: Sedan
4th row: Sedan
5th row: Sedan

Value | Count | Frequency (%)
vehicle | 64246 | 18.5%
utility | 49457 | 14.2%
station | 49455 | 14.2%
sedan | 47158 | 13.6%
wagon/sport | 36096 | 10.4%
passenger | 27716 | 8.0%
(blank) | 13439 | 3.9%
wagon | 13359 | 3.8%
sport | 13358 | 3.8%
truck | 4339 | 1.3%
Other values (216) | 28474 | 8.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 204635 | 8.1%
S | 200575 | 7.9%
t | 181889 | 7.2%
i | 150270 | 5.9%
a | 122930 | 4.9%
e | 122469 | 4.8%
n | 120231 | 4.8%
E | 116403 | 4.6%
o | 111274 | 4.4%
T | 77028 | 3.0%
Other values (52) | 1118651 | 44.3%

Most occurring categories, scripts, and blocks

All 2526355 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

VEHICLE TYPE CODE 4
Text

MISSING 

Distinct: 101
Distinct (%): 0.3%
Missing: 2043115
Missing (%): 98.4%
Memory size: 15.8 MiB

Length

Max length: 35
Median length: 30
Mean length: 17.97682
Min length: 2

Characters and Unicode

Total characters: 580867
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 0.1%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Sedan
3rd row: Station Wagon/Sport Utility Vehicle
4th row: Sedan
5th row: Sedan

Value | Count | Frequency (%)
vehicle | 14893 | 18.9%
utility | 11719 | 14.8%
station | 11719 | 14.8%
sedan | 11398 | 14.4%
wagon/sport | 8867 | 11.2%
passenger | 5970 | 7.6%
(blank) | 2859 | 3.6%
sport | 2852 | 3.6%
wagon | 2852 | 3.6%
truck | 798 | 1.0%
Other values (103) | 5046 | 6.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 46717 | 8.0%
S | 46409 | 8.0%
t | 44549 | 7.7%
i | 36568 | 6.3%
a | 29793 | 5.1%
e | 29584 | 5.1%
n | 29274 | 5.0%
o | 27071 | 4.7%
E | 24669 | 4.2%
l | 17966 | 3.1%
Other values (47) | 248267 | 42.7%

Most occurring categories, scripts, and blocks

All 580867 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

VEHICLE TYPE CODE 5
Text

MISSING 

Distinct: 70
Distinct (%): 0.8%
Missing: 2066635
Missing (%): 99.6%
Memory size: 15.8 MiB

Length

Max length: 35
Median length: 30
Mean length: 18.214058
Min length: 2

Characters and Unicode

Total characters: 160138
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 0.4%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: Sedan
4th row: Sedan
5th row: Station Wagon/Sport Utility Vehicle

Value | Count | Frequency (%)
vehicle | 4020 | 18.5%
utility | 3326 | 15.3%
station | 3326 | 15.3%
sedan | 3182 | 14.7%
wagon/sport | 2524 | 11.6%
passenger | 1487 | 6.8%
(blank) | 804 | 3.7%
wagon | 804 | 3.7%
sport | 802 | 3.7%
truck | 245 | 1.1%
Other values (68) | 1196 | 5.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 12934 | 8.1%
S | 12722 | 7.9%
t | 12689 | 7.9%
i | 10410 | 6.5%
a | 8409 | 5.3%
e | 8354 | 5.2%
n | 8289 | 5.2%
o | 7724 | 4.8%
E | 6129 | 3.8%
l | 5114 | 3.2%
Other values (44) | 67364 | 42.1%

Most occurring categories, scripts, and blocks

All 160138 characters (100.0%) fall into a single category, a single script, and a single block, each reported as (unknown), so the per-category, per-script, and per-block character frequencies are identical to the "Most occurring characters" table.

Interactions

[Figures] 64 interaction plots

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
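
A nullity correlation of the kind described above can be sketched as the Pearson correlation between two columns' "value present" indicators. The code below does this in plain Python for three columns of the first ten sample rows, with `None` standing in for NaN. The helper name `nullity_correlation` is hypothetical, and treating the heatmap as a Pearson correlation of presence indicators is an assumption about how it was computed.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'value present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is always or never present has no correlation
    return cov / math.sqrt(var_a * var_b)


# LATITUDE, LONGITUDE, and BOROUGH for the first ten sample rows (None = NaN).
latitude = [None, None, None, 40.667202, 40.683304, None, 40.709183, 40.868160, 40.671720, 40.751440]
longitude = [None, None, None, -73.866500, -73.917274, None, -73.956825, -73.831480, -73.897100, -73.973970]
borough = [None, None, None, "BROOKLYN", "BROOKLYN", None, None, "BRONX", "BROOKLYN", "MANHATTAN"]

lat_lon = nullity_correlation(latitude, longitude)
lat_boro = nullity_correlation(latitude, borough)
print(lat_lon, lat_boro)
```

In these ten rows LATITUDE and LONGITUDE are always missing together, which is consistent with their identical missing counts (233626) in the overview, while BOROUGH is missing in one row that still has coordinates.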

Sample

First rows

(index) | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5
0 | 09/11/2021 | 2:39 | NaN | NaN | NaN | NaN | NaN | WHITESTONE EXPRESSWAY | 20 AVENUE | NaN | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Aggressive Driving/Road Rage | Unspecified | NaN | NaN | NaN | 4455765 | Sedan | Sedan | NaN | NaN | NaN
1 | 03/26/2022 | 11:45 | NaN | NaN | NaN | NaN | NaN | QUEENSBORO BRIDGE UPPER | NaN | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Pavement Slippery | NaN | NaN | NaN | NaN | 4513547 | Sedan | NaN | NaN | NaN | NaN
2 | 06/29/2022 | 6:55 | NaN | NaN | NaN | NaN | NaN | THROGS NECK BRIDGE | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Unspecified | NaN | NaN | NaN | 4541903 | Sedan | Pick-up Truck | NaN | NaN | NaN
3 | 09/11/2021 | 9:35 | BROOKLYN | 11208.0 | 40.667202 | -73.866500 | (40.667202, -73.8665) | NaN | NaN | 1211 LORING AVENUE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4456314 | Sedan | NaN | NaN | NaN | NaN
4 | 12/14/2021 | 8:13 | BROOKLYN | 11233.0 | 40.683304 | -73.917274 | (40.683304, -73.917274) | SARATOGA AVENUE | DECATUR STREET | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | NaN | NaN | 4486609 | NaN | NaN | NaN | NaN | NaN
5 | 04/14/2021 | 12:47 | NaN | NaN | NaN | NaN | NaN | MAJOR DEEGAN EXPRESSWAY RAMP | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4407458 | Dump | Sedan | NaN | NaN | NaN
6 | 12/14/2021 | 17:05 | NaN | NaN | 40.709183 | -73.956825 | (40.709183, -73.956825) | BROOKLYN QUEENS EXPRESSWAY | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4486555 | Sedan | Tractor Truck Diesel | NaN | NaN | NaN
7 | 12/14/2021 | 8:17 | BRONX | 10475.0 | 40.868160 | -73.831480 | (40.86816, -73.83148) | NaN | NaN | 344 BAYCHESTER AVENUE | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4486660 | Sedan | Sedan | NaN | NaN | NaN
8 | 12/14/2021 | 21:10 | BROOKLYN | 11207.0 | 40.671720 | -73.897100 | (40.67172, -73.8971) | NaN | NaN | 2047 PITKIN AVENUE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inexperience | Unspecified | NaN | NaN | NaN | 4487074 | Sedan | NaN | NaN | NaN | NaN
9 | 12/14/2021 | 14:58 | MANHATTAN | 10017.0 | 40.751440 | -73.973970 | (40.75144, -73.97397) | 3 AVENUE | EAST 43 STREET | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4486519 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN

Last rows

(index) | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5
2075417 | 03/05/2024 | 20:40 | QUEENS | 11375.0 | 40.722622 | -73.849144 | (40.722622, -73.849144) | YELLOWSTONE BOULEVARD | GERARD PLACE | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 4707384 | Sedan | Tractor Truck Diesel | NaN | NaN | NaN
2075418 | 03/05/2024 | 7:30 | NaN | NaN | 40.772953 | -73.920280 | (40.772953, -73.92028) | 26 STREET | HOYT AVENUE NORTH | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Turning Improperly | Driver Inattention/Distraction | NaN | NaN | NaN | 4707737 | Box Truck | Garbage or Refuse | NaN | NaN | NaN
2075419 | 03/05/2024 | 14:50 | NaN | NaN | 40.646000 | -73.971750 | (40.646, -73.97175) | CHURCH AVENUE | EAST 8 STREET | NaN | 2.0 | 0.0 | 2 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | NaN | NaN | 4707432 | NaN | NaN | NaN | NaN | NaN
2075420 | 03/05/2024 | 14:00 | NaN | NaN | 40.722250 | -74.005920 | (40.72225, -74.00592) | CANAL STREET | AVENUE OF THE AMERICAS | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Following Too Closely | Following Too Closely | NaN | NaN | NaN | 4707476 | Sedan | NaN | NaN | NaN | NaN
2075421 | 02/06/2024 | 12:37 | BROOKLYN | 11235.0 | 40.586670 | -73.966156 | (40.58667, -73.966156) | OCEAN PARKWAY | AVENUE Z | NaN | 1.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4707884 | E-Bike | NaN | NaN | NaN | NaN
2075422 | 03/05/2024 | 17:22 | QUEENS | 11436.0 | 40.680477 | -73.792100 | (40.680477, -73.7921) | SUTPHIN BOULEVARD | FOCH BOULEVARD | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Failure to Yield Right-of-Way | Unspecified | NaN | NaN | NaN | 4707511 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN
2075423 | 03/05/2024 | 17:00 | BROOKLYN | 11204.0 | 40.610786 | -73.978820 | (40.610786, -73.97882) | NaN | NaN | 161 AVENUE O | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Driver Inexperience | Unspecified | Unspecified | Unspecified | NaN | 4707419 | Ambulance | PK | Van | PK | NaN
2075424 | 03/03/2024 | 17:50 | NaN | NaN | 40.675053 | -73.947235 | (40.675053, -73.947235) | SAINT MARKS AVENUE | NaN | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Aggressive Driving/Road Rage | Unspecified | NaN | NaN | NaN | 4707855 | Station Wagon/Sport Utility Vehicle | PK | NaN | NaN | NaN
2075425 | 03/05/2024 | 14:30 | BROOKLYN | 11207.0 | 40.677900 | -73.892586 | (40.6779, -73.892586) | MILLER AVENUE | FULTON STREET | NaN | 1.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | 0 | Pedestrian/Bicyclist/Other Pedestrian Error/Confusion | NaN | NaN | NaN | NaN | 4707872 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN
2075426 | 03/05/2024 | 8:00 | QUEENS | 11385.0 | 40.706512 | -73.878136 | (40.706512, -73.878136) | EDSALL AVENUE | 73 STREET | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Failure to Yield Right-of-Way | Unspecified | NaN | NaN | NaN | 4707447 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN
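
In the sample rows, CRASH DATE and CRASH TIME are separate columns, with dates written month/day/year (a value such as 03/26/2022 can only be read that way) and times without a leading zero. A minimal sketch of combining the two into one timestamp follows; the function name `crash_timestamp` is hypothetical and the format string is inferred from the sample values, not taken from the dataset's documentation.

```python
from datetime import datetime


def crash_timestamp(crash_date, crash_time):
    """Combine 'MM/DD/YYYY' and 'H:MM' (24-hour) strings into one datetime."""
    return datetime.strptime(f"{crash_date} {crash_time}", "%m/%d/%Y %H:%M")


# CRASH DATE and CRASH TIME of the first two sample rows.
first = crash_timestamp("09/11/2021", "2:39")
second = crash_timestamp("03/26/2022", "11:45")
print(first.isoformat(), second.isoformat())
```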